avocado: A Variant Caller, Distributed
Authors
Abstract
In this paper, we present avocado, a distributed variant caller built on top of ADAM and Spark. avocado’s goal is to provide both high performance and high accuracy in an open source variant calling framework. To achieve this, we implement both local assembly and pileup-based single nucleotide polymorphism (SNP) calling. A key innovation of our work is a set of heuristics for deciding when to use the more expensive assembly-based methods instead of pileup-based methods. Additionally, we introduce the concept of “significant statistics,” a tool for performing incremental joint variant calling.
Similar resources
Reliably Detecting Clinically Important Variants Requires Both Combined Variant Calls and Optimized Filtering Strategies
A diversity of tools is available for identification of variants from genome sequence data. Given the current complexity of incorporating external software into a genome analysis infrastructure, a tendency exists to rely on the results from a single tool alone. The quality of the output variant calls is highly variable however, depending on factors such as sequence library quality as well as th...
Symplastic Solute Transport and Avocado Fruit Development: A Decline in Cytokinin/ABA Ratio is Related to Appearance of the Hass Small Fruit Variant
Studies on the effect of fruit size on endogenous ABA and isopentenyladenine (iP) in developing avocado (Persea americana Mill. cv. Hass) fruit revealed that ABA content was negatively correlated with fruit size whilst the iP/ABA ratio showed a linear relationship with increasing size of fruit harvested 226 d after full bloom. The effect of this change in hormone balance on the relationship bet...
16GT: a fast and sensitive variant caller using a 16-genotype probabilistic model
16GT is a variant caller for Illumina whole-genome and whole-exome sequencing data. It uses a new 16-genotype probabilistic model to unify single nucleotide polymorphism and insertion and deletion calling in a single variant calling algorithm. In benchmark comparisons with 5 other widely used variant callers on a modern 36-core server, 16GT demonstrated improved sensitivity in calling single nu...
Evaluating the performance of tools used to call minority
Background: High-throughput whole genome sequencing facilitates investigation of minority sub-populations from virus positive samples. Minority variants are useful in understanding within and between host diversity, population dynamics and can potentially help to elucidate person-person transmission chains. Several minority variant callers have been developed to describe the minority variants s...
Publication date: 2013